Dissimilarity Plots: A Visual Exploration Tool for Partitional Clustering

Authors

  • Michael Hahsler
  • Kurt Hornik
Abstract

For hierarchical clustering, dendrograms provide a convenient and powerful visualization. Although many visualization methods have been suggested for partitional clustering, their usefulness deteriorates quickly with increasing dimensionality of the data, and/or they fail to represent structure between and within clusters simultaneously. In this paper we extend (dissimilarity) matrix shading with several reordering steps based on seriation. Both methods, matrix shading and seriation, have been well known for a long time. However, only recent algorithmic improvements allow seriation to be used for larger problems. Furthermore, seriation is used in a novel stepwise process (within each cluster and between clusters), which leads to a visualization technique that is independent of the dimensionality of the data. A major advantage is that it presents the structure between clusters and the micro-structure within clusters in one concise plot. This not only allows cluster quality to be judged but also makes misspecification of the number of clusters apparent. We give a detailed discussion of the construction of dissimilarity plots and demonstrate their usefulness with several examples.
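
The stepwise reordering described in the abstract (first between clusters, then within each cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `greedy_order` and `dissimilarity_plot_order` are hypothetical, Euclidean distance and average between-cluster dissimilarity are assumptions, and a greedy nearest-neighbor path stands in for a proper seriation method.

```python
import numpy as np


def greedy_order(d):
    """Order items along a greedy nearest-neighbor path.

    A simple stand-in for a seriation heuristic (assumption, not the
    method used in the paper).
    """
    n = d.shape[0]
    if n <= 2:
        return list(range(n))
    # Start from one endpoint of the most dissimilar pair.
    start = int(np.unravel_index(np.argmax(d), d.shape)[0])
    order = [start]
    remaining = set(range(n)) - {start}
    while remaining:
        last = order[-1]
        # Pick the closest unvisited item; break ties by index.
        nxt = min(remaining, key=lambda j: (d[last, j], j))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def dissimilarity_plot_order(x, labels):
    """Return (permutation, reordered dissimilarity matrix, reordered labels)."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)

    # Pairwise Euclidean dissimilarities (assumed measure).
    diff = x[:, None, :] - x[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))

    clusters = np.unique(labels)
    members = [np.flatnonzero(labels == c) for c in clusters]
    k = len(clusters)

    # Step 1: order the clusters using average between-cluster dissimilarity.
    between = np.zeros((k, k))
    for a in range(k):
        for b in range(k):
            if a != b:
                between[a, b] = d[np.ix_(members[a], members[b])].mean()
    cluster_order = greedy_order(between)

    # Step 2: order the objects within each cluster.
    perm = []
    for c in cluster_order:
        idx = members[c]
        within = greedy_order(d[np.ix_(idx, idx)])
        perm.extend(idx[within].tolist())
    perm = np.array(perm)

    return perm, d[np.ix_(perm, perm)], labels[perm]
```

Shading the reordered matrix, for example with an image plot in which darker cells mean smaller dissimilarity, would then show clusters as blocks along the diagonal, with off-diagonal blocks indicating how closely clusters are related.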

Similar resources

Similarity Measures and Clustering of String Patterns

Clustering is a powerful tool in revealing the intrinsic organization of data. A clustering of structural patterns consists of an unsupervised association of data based on the similarity of their structures and primitives. This chapter addresses the problem of structural clustering, and presents an overview of similarity measures used in this context. The distinction between string matching and...

On Data-Independent Properties for Density-Based Dissimilarity Measures in Hybrid Clustering

Hybrid clustering combines partitional and hierarchical clustering for computational effectiveness and versatility in cluster shape. In such clustering, a dissimilarity measure plays a crucial role in the hierarchical merging. The dissimilarity measure has great impact on the final clustering, and data-independent properties are needed to choose the right dissimilarity measure for the problem a...

Partitional Clustering Experiments with News Documents

We have carried out experiments in clustering a news corpus. In these experiments we have used two partitional methods varying two different parameters of the clustering tool. In addition, we have worked with the whole document (news) and with representative parts of the document. We have obtained good results working with a representative part of the document. The experiments have been carried...

Bias-correction fuzzy clustering algorithms

Keywords: Cluster analysis; Fuzzy clustering; Fuzzy c-means (FCM); Initialization; Bias correction; Probability weight. Abstract: Fuzzy clustering is generally an extension of hard clustering and it is based on fuzzy membership partitions. In fuzzy clustering, the fuzzy c-means (FCM) algorithm is the most commonly used clustering method. Numerous studies have presented various generalizations o...

Publication date: 2009